ABSTRACT


Penstemon oklahomensis is endemic to the southern plains region and occurs in Oklahoma and at one location in Texas. As an aid to conservation of this species, we used a species distribution model to map the possible extent of Penstemon oklahomensis in Oklahoma. Location data were derived from specimens in the Oklahoma Vascular Plants Database and occurrence records maintained by the Oklahoma Natural Heritage Inventory. We then modeled the potential distribution of P. oklahomensis with MaxEnt, a technique suitable for presence-only data. The resulting maps were used to field validate the model and to determine whether a gap in the southwestern portion of its range was a sampling artifact; this fieldwork resulted in two new county records for the species.


SUMMARY


Penstemon oklahomensis is an endemic of the southern plains region that occurs in Oklahoma and at one locality in Texas. As an aid to the conservation of this species, we used a species distribution model to map the possible extent of Penstemon oklahomensis in Oklahoma. Location data were derived from existing specimens in the Oklahoma Vascular Plants Database and from occurrence records maintained by the Oklahoma Natural Heritage Inventory. We then modeled the potential distribution of P. oklahomensis with MaxEnt, a technique appropriate for presence-only data. The resulting maps were used in the field to validate the model and to determine whether a gap in the southwestern portion of its range was a sampling artifact, resulting in two first county records for the species.


INTRODUCTION 


Advances in geospatial technologies and increased accessibility of species collection data have placed the mapping of species’ geographical ranges at the forefront of biogeographic research and conservation planning (Guisan & Thuiller 2005; Franklin 2009; Peterson et al. 2011). Location information from natural history collections has become prevalent online in large databases (Elith et al. 2006; Newbold 2010; Peterson et al. 2011), allowing for easier mapping of a species’ current and historic localities (Graham et al. 2004; Newbold 2010; Peterson et al. 2011). Likewise, efforts to collect precise location data, with the assistance of GPS, also have improved in recent years. Models of species distribution have provided insight into, and generated hypotheses about, a species’ ecology, and have aided in the location of new populations (Austin 2002; Hirzel et al. 2002; Guisan & Thuiller 2005; Franklin 2009; Lobo et al. 2010; Newbold 2010; Naimi et al. 2011).

The objective of this study was to develop a predictive range map for the distribution of Penstemon oklahomensis Pennell (Plantaginaceae). The genus Penstemon contains approximately 237 species and is one of the largest plant genera in North America (Freeman In prep; Lindgren & Wilde 2003; Nold 1999). Although P. oklahomensis is one of 13 species of Penstemon that occur in Oklahoma, it is a unique regional endemic of the southern plains region. It has been documented in 24 Oklahoma counties (Hoagland et al. 2012). Known populations were restricted to central Oklahoma until the recent discovery of a population in North Texas (Mink et al. 2010). Penstemon oklahomensis is a native perennial that begins flowering in April and is one of only four species of Penstemon with a closed throat floral morphology. Of these four, P. oklahomensis has the most restricted distribution (Clements et al. 1998; Pennell 1935). Penstemon oklahomensis most frequently occurs in remnant tallgrass prairie, but has also been found in other prairie types as well as open woodlands (Hoagland et al. 2012). The Oklahoma Natural Heritage Inventory tracks P. oklahomensis as a state rare species (S1). At the global level, it is ranked as G3 (either very rare and local throughout its range or found locally, even abundantly, at some of its locations) and S3 (rare and locally distributed within Oklahoma) (Oklahoma Natural Heritage Inventory 2012).

J. Bot. Res. Inst. Texas 7(2): 891-899. 2013

This project is also intended to contribute to our understanding of the ecology of P. oklahomensis, for which there are few published studies. A recent study of Penstemon oklahomensis habitat indicated that the soils where populations occur ranged from sandy loam to loam, with a pH range of 5.5-7.6 and relatively low nitrogen, phosphorus, and potassium levels. The same study also found that P. oklahomensis populations persist in grassy roadside areas that are disturbed through various mowing regimes (Messick & Hoagland 2012).


Study Area 

The study area encompasses the entire state of Oklahoma. Although populations of P. oklahomensis have not been documented in western parts of the state, congeners occur in all regions, and we took this approach to determine the regional extent of potential habitat. The long axis of Oklahoma has an east-west orientation; the state spans 8.5 degrees of longitude (from 94°30'W to 103°W) and 3.5 degrees of latitude (33°30'N to 37°N). Along this axis are two important environmental gradients: elevation and precipitation. Elevation decreases from 1,516 m in the northwest, at Black Mesa, to 110 m in the southeast, where the Little River exits the state into Arkansas. Average annual precipitation also follows a northwest to southeast gradient, with the lowest values in the northwest (43 cm) and the highest in the southeast (142 cm). There is a weak south-north gradient in temperature. The length of the growing season ranges from 225-230 days along the Red River to 175 days on the border with Kansas. Average annual temperature increases roughly from 13.3°C in the northwest to 16.7°C in the southeast (Johnson & Duchon 1995).


METHODS 


The successful analysis of species distribution relies upon the compilation of numerous layers of geospatial data. The primary dataset for such analyses is location data, preferably in geographic coordinates, derived from specimen data. Location data for P. oklahomensis were compiled from specimen voucher data from the Oklahoma Vascular Plants Database (OVPD) (Hoagland et al. 2012), the Oklahoma Natural Heritage Inventory (ONHI) (Oklahoma Natural Heritage Inventory 2012), and other sources (Freeman 1981). As noted earlier, a population of P. oklahomensis has been reported from northeastern Texas, 51.5 km from the nearest Oklahoma population (Mink et al. 2010). It was excluded from this analysis because we lacked detailed information on the population in question and access to geospatial data for Texas. We recognize the importance of this population, however, and encourage the exploration of intervening areas between the Texas and Oklahoma populations.

Once extracted, location data were compiled into a geodatabase and edited to remove duplicate records. Duplicate records were found primarily in the OVPD and are a byproduct of specimen exchanges between in-state institutions. Duplicate records also existed between the OVPD and the ONHI database. Next, the geographic precision of the records was assessed. Geographical coordinates were not provided on the majority of herbarium vouchers predating 2000, but driving directions and/or land survey references (e.g., township, range, and section) were recorded. Thus it was necessary to manually assign geographical coordinates. Specimens that listed only the county or an equally vague geographic reference (e.g., Indian Territory) were excluded from analysis. The resulting dataset for analysis consisted of 142 location points (Fig. 1).
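The record-cleaning steps described above can be illustrated with a minimal sketch. It assumes a simple in-memory record type; the field names, the duplicate key (collector plus collection number), and the example values are hypothetical and are not taken from the OVPD or ONHI schemas.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Record:
    """Hypothetical occurrence record; field names are illustrative only."""
    source: str                   # e.g. "OVPD" or "ONHI"
    collector: str
    collection_number: str
    locality: str
    lat: Optional[float] = None   # decimal degrees; None if not georeferenced
    lon: Optional[float] = None


def clean_records(records: List[Record]) -> List[Record]:
    """Drop records without coordinates, then drop duplicate collections."""
    seen = set()
    kept = []
    for rec in records:
        # Records whose locality was too vague to georeference are excluded.
        if rec.lat is None or rec.lon is None:
            continue
        # Assumed duplicate key: the same collector and collection number,
        # which is how exchanged duplicate sheets would typically match.
        key = (rec.collector.strip().lower(), rec.collection_number.strip())
        if key in seen:
            continue
        seen.add(key)
        kept.append(rec)
    return kept


if __name__ == "__main__":
    demo = [
        Record("OVPD", "A. Collector", "101", "roadside prairie", 35.2, -97.4),
        Record("ONHI", "a. collector", "101", "roadside prairie", 35.2, -97.4),
        Record("OVPD", "B. Collector", "7", "Indian Territory"),
        Record("OVPD", "B. Collector", "8", "open woodland", 34.9, -97.9),
    ]
    for rec in clean_records(demo):
        print(rec.source, rec.collector, rec.collection_number)
```

Filtering on coordinates before checking for duplicates means that a georeferenced copy of a specimen is never discarded in favor of an ungeoreferenced one.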

When mapping species distributions, it is important to examine both the extent of occurrence (EOO) and the area of occupancy (AOO). The EOO represents the entire area in which a species has been found, including gaps between populations, and is bounded by the outermost occurrences of a species. The gaps between populations may simply represent inadequate sampling effort or may be areas of unsuitable habitat. The EOO for a species can be mapped using the convex hull operation. The resulting map is a more accurate depiction of a species distribution than one created using rectangles or circles encompassing all known locations of a species (Podani 2009). The AOO is the area where the target species is actually found and is either equal to or smaller than the EOO. The AOO does not include gaps between populations. Removing gaps or discontinuities in an EOO results in the AOO for a species (Gaston & Fuller 2009).

Messick and Hoagland, Distribution modeling of Penstemon oklahomensis 893

Fig. 1. Point distribution map of Penstemon oklahomensis in Oklahoma before modeling, including interstate and state highways.

The EOO map of P. oklahomensis was generated from the species occurrence data layer with the Convex Hull module of ArcGIS 10. Upon inspection, the resulting map exhibited a significant gap in the species distribution between central and southwestern Oklahoma. Thus, we repeated the convex hull operation so that the southwestern Oklahoma collection points (n=15) were aggregated into a polygon separate from a larger central Oklahoma polygon.
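The convex hull operation behind an EOO polygon can be sketched in a few lines. This is not the ArcGIS implementation; it is a standard monotone-chain hull plus the shoelace area formula, and it assumes the points are already in a projected planar coordinate system (for example, kilometers), not latitude and longitude. The coordinates in the demo are made up.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Return hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


def polygon_area(vertices: List[Point]) -> float:
    """Area of a simple polygon by the shoelace formula."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


if __name__ == "__main__":
    # Hypothetical planar coordinates for two groups of occurrences.
    central = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (2.0, 1.0)]
    southwest = [(-6.0, -4.0), (-5.0, -4.0), (-5.0, -3.0)]
    for name, group in (("central", central), ("southwest", southwest)):
        hull = convex_hull(group)
        print(name, len(hull), polygon_area(hull))
```

Computing one hull per group, as in the demo, mirrors the decision to give the southwestern points their own polygon so that the gap is not absorbed into a single statewide hull.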

Maximum entropy (MaxEnt), the method most frequently used in species distribution modeling, was chosen for modeling the distribution of P. oklahomensis (Franklin 2009; Naimi et al. 2011). MaxEnt was designed specifically for use with presence-only data, such as the P. oklahomensis dataset, and can analyze small sample sizes (< 100 samples) and overcome sampling bias (Franklin 2009).

MaxEnt analyzes species occurrence data in conjunction with a suite of environmental data to calculate an index of relative suitability for a species (Graham et al. 2008; Anderson & Gonzalez 2011; Elith et al. 2011). Environmental factors are independent variables and are referred to as covariates or predictor variables. The environmental variables used in this study were elevation, slope, aspect, land cover type, soil order, soil series, geology, mean minimum annual temperature, mean maximum annual temperature, and mean annual precipitation (Table 1). Slope and aspect were derived from a 30 m DEM using the ArcGIS Spatial Analyst Toolbox and then clipped to the political boundaries of Oklahoma. All of the remaining environmental variables were acquired as vector data and were converted to raster format to match the extent and scale of the DEM.

MaxEnt attempts to derive a log-linear model from the presence points and a set of points selected randomly from the environmental data layers, referred to as background points, to estimate the probability of an occurrence or population in a locality. Should an actual species presence point be selected as a background point, the environmental features are rescaled to a range of 0-1 and an error bound for the point is calculated. The error bounds are calculated from the environmental features rather than the presence points because the presence points are often biased (Elith et al. 2011).
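To make the log-linear form concrete, the sketch below rescales one feature to 0-1 and evaluates a Gibbs (exponential) distribution over a handful of background cells. The feature values and weights are invented for illustration; MaxEnt estimates its weights from the data with regularization, and that fitting step is not shown here.

```python
import math
from typing import List


def rescale01(values: List[float]) -> List[float]:
    """Min-max rescale a feature so that it spans 0 to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def gibbs_distribution(features: List[List[float]],
                       weights: List[float]) -> List[float]:
    """Log-linear model: p(cell) is proportional to exp(weights . features)."""
    scores = [sum(w * f for w, f in zip(weights, row)) for row in features]
    top = max(scores)                      # subtract the max for stability
    expo = [math.exp(s - top) for s in scores]
    total = sum(expo)
    return [e / total for e in expo]


if __name__ == "__main__":
    # Hypothetical elevations (m) and a second, already scaled feature.
    elevation = rescale01([87.0, 806.0, 446.5])
    other = [1.0, 0.0, 0.5]
    cells = [[e, o] for e, o in zip(elevation, other)]
    hypothetical_weights = [2.0, -1.0]     # not fitted values
    for prob in gibbs_distribution(cells, hypothetical_weights):
        print(round(prob, 4))
```

The probabilities sum to one over the background cells, which is why this raw output is usually read as a relative suitability index, not as an absolute probability of presence.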


Table 1. Environmental variables used within MaxEnt models for potential distribution mapping of Penstemon oklahomensis.

Variable                    Range & Unit                 Source
Aspect                      0-359°                       Oklahoma Digital Elevation Model (www.csa.ou.edu)
Elevation                   87-806 m                     Oklahoma Digital Elevation Model (www.csa.ou.edu)
Geology                     156 categories               Geologic Map of the United States (www.mrdata.usgs.gov)
Land Cover                  15 categories                National Land Cover Database (www.mrlc.gov)
Mean Max Temp               30.5-37.2°C (87-99°F)        Prism Climate Group (www.prismclimate.org)
Mean Min Temp               -8.3 to 0.6°C (17-33°F)      Prism Climate Group (www.prismclimate.org)
Mean Annual Precipitation   43.2-180.3 cm (17-71 in)     Prism Climate Group (www.prismclimate.org)
Slope                       0-57°                        Oklahoma Digital Elevation Model (www.csa.ou.edu)
Soil Order                  7 categories                 STATSGO (Soil Survey Staff 2005; soils.usda.gov/survey/geography/statsgo)
Soil Series Association     228 categories               STATSGO (Soil Survey Staff 2005; soils.usda.gov/survey/geography/statsgo)


Model evaluation


The output of MaxEnt is a probability of species occurrence based on the concept of maximum entropy, or whether a pattern of occurrence is uniform across the landscape given the environmental variables used in the model. From the replicates, the model with the highest test area under the curve (AUC) is selected (Elith et al. 2006; Phillips et al. 2006; Franklin 2009; Elith et al. 2011; Warren & Seifert 2011). The AUC of a receiver-operating characteristic (ROC) plot is a threshold-independent metric (Franklin 2009; Jimenez-Valverde 2012). A ROC plot graphs “the false-positive error rate on the x-axis (1 - Specificity) versus the true positive rate on the y-axis (Sensitivity) based on each possible value of threshold probability” (Franklin 2009). The AUC is calculated from the resulting curve and, for a model that performs at least as well as chance, ranges from 0.5 to 1.0. The value 0.5 represents random predictions while values above 0.5 represent “performance better than random” (Franklin 2009; Jimenez-Valverde 2012). An AUC value between 0.5 and 0.7 indicates low or poor performance, a value between 0.7 and 0.9 indicates moderate performance, and values greater than 0.9 indicate high performance (Swets 1988; Franklin 2009).
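As a worked illustration of the AUC, the sketch below uses the rank interpretation of the statistic: the probability that a randomly chosen presence point receives a higher score than a randomly chosen background point, with ties counted as one half. The scores are invented, and the handling of values that fall exactly on a class boundary in `swets_class` is an assumption, since the scale as quoted above does not settle it.

```python
from typing import List


def auc(presence_scores: List[float], background_scores: List[float]) -> float:
    """AUC as the fraction of presence/background pairs ranked correctly."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5      # ties count as half a correct ranking
    return wins / (len(presence_scores) * len(background_scores))


def swets_class(value: float) -> str:
    """Performance class following the scale quoted in the text."""
    if value > 0.9:
        return "high"
    if value >= 0.7:             # assumption: 0.7 itself counts as moderate
        return "moderate"
    return "low"


if __name__ == "__main__":
    presence = [0.9, 0.8, 0.4]          # hypothetical model scores
    background = [0.5, 0.3, 0.2, 0.1]
    value = auc(presence, background)
    print(value, swets_class(value))
```

Under this convention a value of exactly 0.900 is classed as moderate, because only values greater than 0.9 are treated as high.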

We used MaxEnt version 3.3.3e modeling software (www.cs.princeton.edu/~schapire/maxent) to model the potential distribution of P. oklahomensis. The analysis was run with 0%, 10%, 20%, 30%, 40%, and 50% of the P. oklahomensis point locations withheld for testing the model. Collectively these model runs were called Model Set A. For each percentage category for which points were withheld, 15 replicates were generated. Response curves, jackknife of variable importance, and maps of predicted distributions were also generated. The jackknife of variable importance identifies the individual variable(s) that were most important in predicting the species’ distribution (Elith et al. 2011). In order to evaluate the potential outlier effect of the P. oklahomensis occurrence in eastern Oklahoma, the analysis was conducted a second time and the resulting models were referred to as Model Set B.
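The withholding scheme can be sketched as repeated random train/test splits. How MaxEnt draws its test points internally is not described in the text, so the simple shuffle below is an assumption, as are the function name and the fixed seed.

```python
import random
from typing import List, Sequence, Tuple


def replicate_splits(points: Sequence, percent_withheld: int,
                     n_replicates: int = 15,
                     seed: int = 0) -> List[Tuple[list, list]]:
    """Return (training, test) lists for each replicate."""
    rng = random.Random(seed)            # fixed seed for repeatable splits
    n_test = round(len(points) * percent_withheld / 100)
    splits = []
    for _ in range(n_replicates):
        shuffled = list(points)
        rng.shuffle(shuffled)
        splits.append((shuffled[n_test:], shuffled[:n_test]))
    return splits


if __name__ == "__main__":
    point_ids = list(range(142))         # stand-ins for the 142 locations
    for pct in (0, 10, 20, 30, 40, 50):
        train, test = replicate_splits(point_ids, pct)[0]
        print(pct, len(train), len(test))
```

With 0% withheld every point is used for training, so no independent test points remain in that run.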

MaxEnt created grids for the average, minimum, maximum, median, and standard deviation of the predictions for each withheld-percentage run, based on the test AUC value. The average prediction grids were converted to raster files and the resulting prediction AUC values were then compared. Based on the prediction maps, the gap in the distribution between the southwestern populations and the central populations was surveyed for P. oklahomensis populations. This area included three counties: Grady County, Stephens County, and Jefferson County. If a P. oklahomensis population was discovered, a voucher specimen was collected and deposited at the Robert Bebb Herbarium (OKL) at the University of Oklahoma, Norman, OK.


RESULTS AND DISCUSSION 


The county-level map of the 142 Penstemon oklahomensis location points (Fig. 1) revealed that the majority of collection points were both in central Oklahoma and clustered near interstate or state highways. To verify this pattern, we calculated Moran's I, which indicated significant clustering (I = 0.371, z score = 3.385, p = 0.001). Also, the AOO for this species was much smaller in area than the EOO and, as noted earlier, there were two noteworthy gaps in the EOO: the first in the southeast and a second in the southwest (Fig. 2). Since the gap in the southwest was more pronounced geographically and was represented by a greater number of occurrences (n=15) than the southeast (n=1), it became the focus of our analysis and groundtruthing. Our goal was to ascertain whether this was a true gap in distribution or a sampling artifact.

Fig. 2. Extent of occurrence (EOO) and area of occupancy (AOO) for Penstemon oklahomensis within Oklahoma before modeling.
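For readers unfamiliar with Moran's I, a minimal sketch of the global statistic follows. The text does not state which attribute or spatial weights were used, so the per-cell counts and the simple adjacency weights here are assumptions chosen only to show the calculation; the z score and p value are not computed.

```python
from typing import List


def morans_i(values: List[float], weights: List[List[float]]) -> float:
    """Global Moran's I for values observed at n locations.

    weights[i][j] is the spatial weight between locations i and j,
    with zeros on the diagonal.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    weight_sum = sum(sum(row) for row in weights)
    cross_products = sum(weights[i][j] * dev[i] * dev[j]
                         for i in range(n) for j in range(n))
    sum_squares = sum(d * d for d in dev)
    return (n / weight_sum) * (cross_products / sum_squares)


if __name__ == "__main__":
    # Four hypothetical cells in a row; neighbors share an edge.
    adjacency = [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0],
    ]
    clustered = [10, 8, 2, 0]     # similar values sit next to each other
    alternating = [1, 0, 1, 0]    # neighbors always differ
    print(morans_i(clustered, adjacency))
    print(morans_i(alternating, adjacency))
```

Positive values indicate that similar values occur near one another (clustering), values near zero indicate no spatial pattern, and negative values indicate a dispersed or alternating pattern.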

The training and test AUC values for both model sets are listed in Table 2. Model 40 A had the highest training AUC (0.954) while Model 10 A had the lowest training AUC (0.944). For the test data, Model 10 A had the highest AUC (0.907) and Model 50 A had the lowest AUC (0.889). From the test data used in Model 10 A, the jackknife of the environmental variables showed geology (25.6% contribution) and soil series association (20.6% contribution) to be the most important (Table 3). Model 50 B had the highest training AUC (0.953) and Model 0 B had the lowest training AUC (0.943). For the test data, Model 20 B had the highest AUC (0.900) and Model 30 B had the lowest AUC (0.886). The jackknife of environmental variables from the test data used in Model 20 B also showed geology (27.7% contribution) and soil series association (22.2% contribution) to be the most important (Table 4). Model 20 B (Fig. 3) was selected as the best predictive map for the potential distribution of P. oklahomensis within Oklahoma because of its AUC value, even though that value (0.900) is the cut-off between moderate and high performance on Swets’ (1988) scale.
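The summary rows of Table 2 can be reproduced from the mean test AUC values reported for each model. The sketch below uses those published values directly; the function name is hypothetical, and the sample standard deviation is assumed because it matches the table's Std. Dev. row after rounding.

```python
from statistics import mean, stdev
from typing import Dict

# Mean test AUC values as reported in Table 2.
TEST_AUC_SET_A = {"0A": 0.897, "10A": 0.907, "20A": 0.906,
                  "30A": 0.895, "40A": 0.898, "50A": 0.889}
TEST_AUC_SET_B = {"0B": 0.890, "10B": 0.896, "20B": 0.900,
                  "30B": 0.886, "40B": 0.891, "50B": 0.892}


def summarize(auc_by_model: Dict[str, float]) -> Dict[str, object]:
    """Identify the best and worst models and summarize the spread."""
    values = list(auc_by_model.values())
    return {
        "best": max(auc_by_model, key=auc_by_model.get),
        "worst": min(auc_by_model, key=auc_by_model.get),
        "mean": mean(values),
        "stdev": stdev(values),      # sample standard deviation (n - 1)
    }


if __name__ == "__main__":
    for name, table in (("A", TEST_AUC_SET_A), ("B", TEST_AUC_SET_B)):
        summary = summarize(table)
        print(name, summary["best"], summary["worst"],
              f"{summary['mean']:.4f}", f"{summary['stdev']:.4f}")
```

The best models by test AUC are 10 A in Set A and 20 B in Set B, in agreement with the values cited in the text.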


Groundtruthing 


MaxEnt predicted greater than 25% probability of occurrence of P. oklahomensis populations in extreme northern Grady County and another in southern Grady County, a location within the southwestern distribution gap. The two predicted northern locations were surveyed, but one proved to be a wheat field and the other a grazed pasture. A new population (Fig. 4) of P. oklahomensis was located, however, at the southern location where the model predicted 25%-49% probability of occurrence. Another locality within the portion of Stephens County in the gap, with a greater than 25% probability of an occurrence, was surveyed, but no populations were found. A predicted location in northern Stephens County, with less than a 25% probability of occurrence, did produce a new population (Fig. 4). Additional surveys of counties in the southwestern gap did not yield new populations of P. oklahomensis.

The point locations of the new populations were added to the overall point distribution map and the probability values extracted from the model. The Grady County population probability value was 0.21 and the value for the Stephens County population was 0.03. The two new location points were added to the same dataset used to produce Model Set B, and MaxEnt was run again with the same settings to produce Model Set C. The AUC values of Model Set C are listed in Table 5. Adding the new location points to the dataset lowered the AUC values of all models into the moderate performance range, contrary to expectation.

Table 2. Mean AUC values for training and test points for each replicate set of models run. In the lower section, the minimum AUC value indicates the worst performing model and the maximum AUC value indicates the best performing model.

Model       %          Mean           Mean        Model       %          Mean           Mean
Name        Withheld   Training AUC   Test AUC    Name        Withheld   Training AUC   Test AUC
0A          0          0.947          0.897       0B          0          0.943          0.890
10A         10         0.944          0.907       10B         10         0.944          0.896
20A         20         0.950          0.906       20B         20         0.951          0.900
30A         30         0.952          0.895       30B         30         0.948          0.886
40A         40         0.954          0.898       40B         40         0.950          0.891
50A         50         0.952          0.889       50B         50         0.953          0.892

Mean                   0.950          0.899       Mean                   0.948          0.892
Min                    0.944          0.889       Min                    0.943          0.886
Max                    0.954          0.907       Max                    0.953          0.900
Std. Dev.              0.004          0.007       Std. Dev.              0.004          0.005


Table 3. Variable contributions with percent contribution and permutation importance values for Model 10 A.

Variable        Percent Contribution   Permutation Importance
Geology         25.6                   17.7
Soil Series     20.6                   13.9
Precipitation   19.7                   7.6
Land Cover      13.2                   8.6
Soil Order      5.4                    6.5
Slope           4.9                    5.3
Min Temp        4.8                    15.9
Elevation       4.8                    20.6
Aspect          0.5                    0.8
Max Temp        0.4                    3.0


Table 4. Variable contributions with percent contribution and permutation importance values for Model 20 B.

Variable        Percent Contribution   Permutation Importance
Geology         27.7                   14.7
Soil Series     22.2                   17.7
Land Cover      13.0                   9.8
Precipitation   10.7                   3.2
Soil Order      9.5                    7.7
Elevation       5.6                    13.4
Min Temp        5.3                    20.0
Slope           5.1                    10.2
Max Temp        0.5                    2.8
Aspect          0.5                    0.7


Fig. 4. Counties representing the gap in Penstemon oklahomensis distribution; percent probability of presence from Model 20 B prediction shown. Stars indicate location of newly located populations.

Table 5. AUC values for Model Set C.

Model Name   % Withheld   Test AUC
0C           0            0.8944
10C          10           0.8859
20C          20           0.8819
30C          30           0.8837
40C          40           0.8877
50C          50           0.8851


CONCLUSION


Our objective was to gain a better understanding of the factors controlling the distribution of Penstemon oklahomensis across its geographic range. Data for this effort were collected from Freeman (1981), the OVPD (Hoagland et al. 2012), and the ONHI. Our initial observation was that the highly clustered distribution pattern of P. oklahomensis was a result of collector bias, which was correlated with ease of access due to roads. Further, collection locations were also clustered near cities with universities and in recreational areas.

We then analyzed the relationship of the distribution to environmental factors using MaxEnt. Judged by their AUC values, the resulting maps of potential probability of occurrence were deemed accurate, particularly that of Model 20 B (AUC = 0.90), which sits at the cut-off value between moderate and high performance (Swets 1988; Franklin 2009). Groundtruthing of this model's results led us to two new populations within a “gap” in the range of P. oklahomensis. The low probability of occurrence values for the two new populations, however, suggest that the predictor variables used in the model may not be specific enough to locate additional populations of P. oklahomensis in southwest Oklahoma. Choosing the appropriate scale and type of predictor variables might be confounded by the fact that P. oklahomensis exhibits relatively broad ecological tolerances. Messick and Hoagland (2012), for example, documented that the greatest abundance of individuals of P. oklahomensis was as likely to occur in highway medians as in pastures dominated by native grasses.

Future surveys for P. oklahomensis should be conducted to further evaluate the performance of the model. It is important to note that during this study the Southern Plains were experiencing a severe drought, which in turn affected the number of stems present in known populations of P. oklahomensis. Thus we suggest that the failure to find new populations within predicted areas could have been partially the result of drought.